Upgrade Guide
This guide presents the steps to upgrade your Resolve Insights deployment to version 11.0.0.
The supported upgrade path is:
- from Insights v9.5 (all minor versions) to Insights v11.0.0
- from Insights v9.6 (all minor versions) to Insights v11.0.0
Prerequisites
Ensure that ALL System Requirements and Prerequisites are in place.
Back Up Your Environment
VM Snapshots
Make snapshots of all VMs running Insights nodes.
LDAPS Certificate
If you are using secure LDAP for UI access, back up the cacerts file on all NCE nodes. Due to Java upgrades, you might need to restore it after the upgrade to preserve your changes.
- Find the file in the following location:
JAVA_HOME/lib/security/cacerts
- Use the following command to find the JAVA_HOME value (usually /usr/lib/jvm/java):
env | grep -i java
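The cacerts backup described above can be scripted. The following is a minimal sketch, not part of the Insights tooling: the function name backup_cacerts, the backup file name cacerts.bak, and the example paths are assumptions chosen for illustration.

```bash
#!/usr/bin/env bash
# Sketch: copy the Java truststore to a backup directory before the upgrade.
# backup_cacerts and cacerts.bak are illustrative names, not Insights tooling.
backup_cacerts() {
  local java_home=$1 backup_dir=$2
  local src="$java_home/lib/security/cacerts"
  [ -f "$src" ] || { echo "cacerts not found at $src" >&2; return 1; }
  mkdir -p "$backup_dir" || return 1
  # -p preserves the modification time so the copy can be compared later.
  cp -p "$src" "$backup_dir/cacerts.bak"
}

# Example call (both paths are assumptions):
#   backup_cacerts /usr/lib/jvm/java /root/insights-backup
```

Run it on every NCE node, passing the JAVA_HOME value reported by env | grep -i java.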
Nginx SSL Certificate Settings
Back up the nginx Insights configuration if you have:
- Customized the default SSL certificate settings in the nginx meridian.conf file
- Set a password on the certificate
Files to back up:
/etc/nginx/conf.d/meridian.conf
/etc/nginx/conf.d/meridian.conf.template
For comparison, these are the default SSL certificate setting values:
ssl_certificate /etc/nginx/cert.pem;
ssl_certificate_key /etc/nginx/cert.key;
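To decide quickly whether a backup is needed, you can compare the configuration against the default values listed above. This is a rough sketch under assumptions: ssl_settings_are_default is a hypothetical helper, and it checks only the two certificate paths, not whether a certificate password was set.

```bash
#!/usr/bin/env bash
# Sketch: succeed only if the given nginx config still contains both default
# certificate settings. ssl_settings_are_default is a hypothetical helper name.
ssl_settings_are_default() {
  local conf=$1
  grep -Eq '^[[:space:]]*ssl_certificate[[:space:]]+/etc/nginx/cert\.pem;' "$conf" &&
    grep -Eq '^[[:space:]]*ssl_certificate_key[[:space:]]+/etc/nginx/cert\.key;' "$conf"
}

# Example call:
#   ssl_settings_are_default /etc/nginx/conf.d/meridian.conf \
#     || echo "SSL settings were customized: back up the nginx files"
```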
JVM Options
If you have changed the default JVM options for meridian-rest-service or elasticsearch, take note of your current values or back up the following files for reference:
- meridian-rest-service:
/opt/meridian/etc/sv/meridian-rest-service/run
- elasticsearch:
/opt/meridian/elasticsearch/config/jvm.options
Upgrade Steps
To upgrade Insights, complete the sections and their steps in the provided order.
If the Insights application was initially installed under the default /opt/meridian directory and you want to move it to another location as part of the upgrade process, first do the following:
- Stop all meridian services on all nodes.
- Back up your environment.
- Move the entire content of /opt/meridian into the new directory.
- Make sure that the meridian user has access to the new directory.
- Execute the pre-upgrade and the upgrade steps below in the new directory by replacing /opt in each of the commands with the new directory.
Pre-upgrade Steps
Take the following steps on the indicated node types.
(All nodes) Transfer the installation package that you received from Resolve to /opt/FS or the actual directory where Insights will be installed.
(All nodes) Unpack the installation package:
```bash
tar -xvzf ./meridian-<VERSION>.tgz
```
Note: The following step is needed when the NCE and DC servers do not have access to the RPM repos.
(All nodes) Manually install chrpath and GCC packages.
yum install chrpath
yum install gcc
(All DC nodes) Stop all Insights services:
service meridian stop
service meridian status
(DC Master) Transform the DC data with discovery schedules to JSON for the upgrade.
- Start mariadb service:
service mariadb start
service mariadb status
- Back up the netra_scheduler database. Substitute the /opt directory with a backup directory.
/opt/meridian/mariadb/current/bin/mysqldump -h 127.0.0.1 -u netra -p fixstream --databases netra_scheduler > /opt/netra_scheduler.sql
- Execute the script to transform the scheduler data into JSON files. Substitute the /opt/FS directory with the actual directory where the new version was unpacked.
python /opt/FS/meridian-<VERSION>/meridian-tools/schedule_to_json.py
Note: Disregard the following WARNING message.
/opt/meridian/dc/embedded/lib/python2.7/site-packages/secureconfig/cryptkeeper.py:5: CryptographyDeprecationWarning: Python 2 is no longer supported by the Python core team. Support for it is now deprecated in cryptography, and will be removed in the next release.
from cryptography.fernet import Fernet, InvalidToken
- Check if the scheduler data in the database is transformed. The number of records should match the number of recurring network discoveries. The content of the job_state column should be in JSON format.
dcmysql
MariaDB [netra_dc]> select * from netra_scheduler.apscheduler_jobs;
- Stop mariadb service:
service mariadb stop
service mariadb status
(All NCE nodes) Stop all Insights services:
service meridian stop
service meridian status
If, after running these commands, you still see running Insights processes, use the following command to forcefully stop them:
kill -9 <list of process ids separated with space>
(All NCE nodes) Start Elasticsearch:
service elasticsearch start
(All NCE nodes) Take the following steps on all NCE nodes, one at a time, starting from the Worker nodes and finishing with the Master.
- Disable shard allocation:
curl -X PUT -k -u admin:FixStream "https://<node IP>:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'{"persistent": {"cluster.routing.allocation.enable": "primaries"}}'
Where <node IP> is the current NCE node.
- Stop non-essential indexing and perform a synced flush. Ignore any errors that it might throw.
curl -X POST -k -u admin:FixStream "https://<node IP>:9200/_flush/synced?pretty"
Where <node IP> is the current NCE node.
- Stop Elasticsearch:
service elasticsearch stop
(All nodes) Uninstall the old RPM Meridian packages. Navigate to a directory where there are NO meridian-XXX files or directories, for example, navigate to the / directory.
cd /
yum remove meridian-*
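The per-node Elasticsearch preparation above must run on the Workers first and on the Master last. The following sketch shows one way to derive that order and to build the request body for the shard allocation setting. The function names nce_upgrade_order and allocation_payload are hypothetical; the comma-separated worker list follows the NCE_WORKERS format used in deploy.config.

```bash
#!/usr/bin/env bash
# Sketch: helpers for the per-node Elasticsearch preparation.
# Both function names are illustrative, not part of the Insights tooling.

# Print the nodes in the prescribed order: Workers first, Master last.
# Arguments: master IP, then a comma-separated worker list (spaces tolerated).
nce_upgrade_order() {
  local master=$1 workers=$2 ip
  local IFS=', '
  for ip in $workers; do
    printf '%s\n' "$ip"
  done
  printf '%s\n' "$master"
}

# Print the JSON body for the _cluster/settings call.
# Pass "primaries" before the upgrade, or "null" to reset the setting afterwards.
allocation_payload() {
  local mode=$1
  if [ "$mode" = "null" ]; then
    printf '{"persistent": {"cluster.routing.allocation.enable": null}}\n'
  else
    printf '{"persistent": {"cluster.routing.allocation.enable": "%s"}}\n' "$mode"
  fi
}

# Example call (IPs are placeholders):
#   for ip in $(nce_upgrade_order 1.1.1.1 "1.1.1.2, 1.1.1.3"); do
#     curl -X PUT -k -u admin:FixStream "https://$ip:9200/_cluster/settings?pretty" \
#       -H 'Content-Type: application/json' -d "$(allocation_payload primaries)"
#   done
```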
Configuring the Deployment
Before you start the installation, you need to provide details about your environment and the deployment type. Use the deploy.config file to provide those details.
Prepare a deploy.config file for each node that you will have in your Insights cluster. Each configuration setting in this file is labeled with the node type or types that it applies to.
On nodes where the setting is not required, keep its default value. Do not remove the line and do not delete the value.
Repeat these steps on each of the VMs that will be part of the deployment environment and take note of which node type you're configuring.
(All nodes) Open deploy.config for editing. Replace /opt/FS in the next command with the directory you created for Insights.
vi /opt/FS/meridian-<VERSION>/meridian-tools/deploy.config
(All nodes) Update the variables in the deploy.config file as per the node role:
#############################################################
# Necessary Input Parameters
#############################################################
### ESSENTIAL ###
VERSION="11.0.0" //(All) Build number that is being deployed
NCE_MASTER_IP="1.1.1.1" //(All) NCE Master IP in IPv4 or IPv6 format
NCE_MASTER_HOSTNAME="test-nce1" //(All) Hostname of NCE Master
DEPLOYMENT_PATH="/opt/FS" //(All) The installation directory that you created
################
### Needed for NCE-Cluster setup ###
NCE_WORKERS="1.1.1.2, 1.1.1.3" //(All) NCE Workers IP in IPv4 or IPv6 format
NCE_HOSTS="test-nce2,test-nce3" //(All) Hostname of NCE Workers
### DC Installation ###
### ESSENTIAL for DC ###
DC_MASTER_IP="1.1.1.4" //(DC nodes only) DC Master IP in IPv4 or IPv6 format
DC_MASTER_HOSTNAME="test-dc1" //(DC nodes only) Hostname of DC Master
KAFKA_DC_GROUP_ID="Group1" //(DC nodes only) Name of cluster group to add DC nodes in. You cannot change this name after the deployment.
#####################
DC_HOSTS="test-dc2,test-dc3" //(DC workers only) Hostname of DC Workers
DC_WORKERS="1.1.1.5,1.1.1.6" //(DC workers only) DC Workers IP in IPv4 or IPv6 format
### GENERAL PARAMETERS ###
UPGRADE_FLAG="true" //(All) Set to false if running a new installation
QUIET_FLAG=0 //(All, optional) Set to 1 to suppress any confirmation prompts during deployment
LOG_LEVEL="WARN" //(All, optional) Set the log verbosity level - OFF, DEBUG, ERROR, FATAL, INFO, WARN, TRACE
ALL_IN_ONE=0 //(All) Set by default to cluster (0) installation type. Set to all-in-one (1) installation type for POCs
IGNORE_PRE_REQ_CHECK=0 //(All, optional) Set to 1 to ignore prerequisites like ulimit and ICMP reachability
FIREWALL_FLAG="true" //(All, optional) Set to false if no firewall is configured in the environment to ignore the firewall-related steps
MERIDIAN_USER="meridian" //(All, optional) Set the Unix username for deployment
MERIDIAN_GROUP="meridian" //(All, optional) Set the Unix group name for deployment
MERIDIAN_HOME="/opt/meridian" //(All, optional) Set the installation directory
SHARED_PATH="${MERIDIAN_HOME}/share" //(All, optional) Set the directory used for sharing files within the Insights setup
localhost_ip="1.1.1.1" //(All, optional) IPv4: Set to preferred IP in case of multiple IPs. IPv6: Always set
is_localhost_ip_overridden="false" //(All, optional) IPv4: Set to true in case of multiple IPs. IPv6: Always set
NON_ROOT_FLAG=0 //(All, optional) Set to true (1) if the Unix username does not have root permissions. The setup will use the prefix from the access level to execute the root commands
SUDO_MODE="sudo" //(All, optional) Set the access level - sudo or dzdo
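Before starting a deployment, it can help to confirm that the settings marked ESSENTIAL are filled in. The sketch below reads KEY=value lines in the format shown above and ignores anything after the value, such as the // annotations. The helper names config_get and missing_essentials are assumptions for illustration, not part of the Insights tooling.

```bash
#!/usr/bin/env bash
# Sketch: read values from a deploy.config-style file without sourcing it.
# config_get and missing_essentials are hypothetical helper names.

# Print the value of KEY from FILE; return 1 if the key is absent.
config_get() {
  local file=$1 key=$2 line value
  line=$(grep -E -m1 "^${key}=" "$file") || return 1
  value=${line#*=}
  if [[ $value == \"* ]]; then
    value=${value#\"}       # drop the opening quote
    value=${value%%\"*}     # drop the closing quote and anything after it
  else
    value=${value%%[[:space:]]*}   # unquoted: keep text up to the first blank
  fi
  printf '%s\n' "$value"
}

# Print the name of every ESSENTIAL setting that is missing or empty.
missing_essentials() {
  local file=$1 key
  for key in VERSION NCE_MASTER_IP NCE_MASTER_HOSTNAME DEPLOYMENT_PATH; do
    [ -n "$(config_get "$file" "$key")" ] || echo "$key"
  done
}

# Example call:
#   missing_essentials /opt/FS/meridian-<VERSION>/meridian-tools/deploy.config
```

An empty result means that all four essential settings have a value. It does not validate the values themselves.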
Deploying the New Version
After preparing deploy.config files for all node types that you will have, you are ready to start the deployment.
Deploying the NCE
- (All NCE Worker nodes) To start the installation on an NCE worker node, run the deployment script with the deploy.config file as an input. Substitute the /opt directory with the actual directory where Insights will be installed.
/opt/FS/meridian-$VERSION/meridian-tools/meridian-deploy -f /opt/FS/meridian-$VERSION/meridian-tools/deploy.config
- (All NCE Worker nodes) After a node deployment completes as evidenced by the success message, ensure that all Insights services are up and running:
service meridian status
- (NCE Master) To start the installation on an NCE Master node, run the deployment script with the deploy.config file as an input. Substitute the /opt directory with the actual directory where Insights will be installed.
/opt/FS/meridian-$VERSION/meridian-tools/meridian-deploy -f /opt/FS/meridian-$VERSION/meridian-tools/deploy.config
- (All NCE nodes) Re-enable shard allocation:
curl -X PUT -k -u admin:FixStream "https://<node IP>:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'{"persistent": {"cluster.routing.allocation.enable": null}}'
- (All NCE Worker nodes) Delete the Storm-related data from Zookeeper on all NCE Workers:
zkCli
deleteall /meridian-storm
After deleting the data from the first worker node, a message that the data is already deleted may appear when you try to delete the data from the next worker node. In that case, ignore the message.
- (NCE master) Ensure that Kibana is running.
  - Check the service's status:
  service kibana status
  - If Kibana is not running, delete the .kibana index on the NCE master and restart Kibana:
  curl --tlsv1.2 -k -w '%{http_code}' -X DELETE -H 'Content-Type: application/json' -u admin:FixStream https://<NCE master IP>:9200/.kibana?pretty
  service kibana stop
  service kibana start
- (NCE Master) Remove the Trial license:
  - Check the Insights login screen. If it indicates 30 days remaining, this means that the trial license has replaced your license and you need to delete it. Otherwise, skip the rest of the steps.
  - In Kibana, go to Dev Tools > Console and run the following query against Elasticsearch to get the list of licenses:
  GET license_config/_search
  {
    "sort": [
      {
        "lastUpdateTime": {
          "order": "desc"
        }
      }
    ]
  }
  - Check the timestamp of the first document. If it was uploaded today, take note of the document ID.
  - From the Kibana Console, run the following query to delete the document:
  DELETE license_config/license/<License ID>
  Where <License ID> is the ID that you took note of.
- (NCE Master) Check if the last modification time of all files inside these directories is recent:
/opt/meridian
/opt/meridian/meridian-rest-service/<version>/
- (All NCE nodes) Restart the rest service:
service meridian-rest-service stop
service meridian-rest-service start
- (NCE Master) Start all topologies:
service meridian start-top
At the very end of the NCE deployment, you might see the following message:
Could not upload ssot data using ssot_data.json file
Following curl command was used
curl -k --tlsv1.2 -XPUT -u admin:****** -H .....
The response http code was 400
See response body at /opt/meridian/es_response.json
Press Y to try again or no to exit.(Y/N).
If you select N the ssot backup will quit and you will have to manually perform it
This message is not relevant to the upgrade. You can complete the deployment successfully by pressing Ctrl+C at the prompt. Note that pressing Y or N does not yield any result.
Deploying the DC
Take the following steps to deploy the new build on your DC nodes:
- (DC Master) To start the installation on a DC Master node, run the deployment script with the deploy.config file as an input. Substitute the /opt directory with the actual directory where Insights will be installed.
/opt/FS/meridian-$VERSION/meridian-tools/meridian-deploy -f /opt/FS/meridian-$VERSION/meridian-tools/deploy.config
- (DC Master) Ensure that the deployment of the DC Master is complete and that all Insights services are up and running on it:
service meridian status
service meridian-dc status
If any of the services is down, start it with the following command (except for logstash):
service <service_name> start
Note: Start all services that are down except for logstash. By default, logstash is not running and it should remain stopped.
- (All DC Workers) To start the installation on a DC worker node, run the deployment script with the deploy.config file as an input. You can deploy DC workers in parallel. Substitute the /opt directory with the actual directory where Insights will be installed.
/opt/FS/meridian-$VERSION/meridian-tools/meridian-deploy -f /opt/FS/meridian-$VERSION/meridian-tools/deploy.config
- (All DC Workers) Ensure that the deployment of all DC Workers is complete and that all Insights services are up on them:
service meridian status
service meridian-dc status
If any of the services is down, start it with the following command (except for logstash):
service <service_name> start
Note: Start all services that are down except for logstash. By default, logstash is not running and it should remain stopped.
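Every deployment step in this guide, on NCE and DC nodes alike, runs the same command and differs only in the installation directory and the version. The sketch below builds that command line from the two values. The function name deploy_command is an assumption; the path layout is the one used in the commands above.

```bash
#!/usr/bin/env bash
# Sketch: print the deployment command for a given installation directory and
# version. deploy_command is a hypothetical helper name.
deploy_command() {
  local deployment_path=$1 version=$2
  local tools="$deployment_path/meridian-$version/meridian-tools"
  printf '%s -f %s\n' "$tools/meridian-deploy" "$tools/deploy.config"
}

# Example call:
#   deploy_command /opt/FS 11.0.0
# Review the printed command, then run it on the node being deployed.
```

Printing the command instead of executing it keeps the sketch safe to try on any machine.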
Post-upgrade Steps
- (All nodes) Restore the cacerts file if it was modified:
  - Check if the file was modified during the upgrade:
  ls -la JAVA_HOME/lib/security/cacerts
  You can use the env | grep -i java command to find the value of JAVA_HOME.
  If the file's modification date shows today's date, then the file was modified by the upgrade.
  - Overwrite the file with the backup copy that you made as part of the prerequisites.
- (All nodes) Restore the nginx configuration:
  - Stop nginx:
  systemctl stop nginx
  - Overwrite the following file with the backup copy that you made as part of the prerequisites.
  /etc/nginx/conf.d/meridian.conf.template
  - Start nginx:
  systemctl start nginx
- (All nodes) Restore the JVM options (if you are using non-default options for meridian-rest-service or elasticsearch):
  - For meridian-rest-service, take these steps:
    - Overwrite /opt/meridian/etc/sv/meridian-rest-service/run with the backup copy that you made as part of the prerequisites or make the changes manually.
    - Restart the service:
    systemctl restart meridian-rest-service
  - For elasticsearch, take these steps:
    - Overwrite /opt/meridian/elasticsearch/config/jvm.options with the backup copy that you made as part of the prerequisites or make the changes manually.
    - Do a rolling restart of Elasticsearch. See the official documentation for detailed steps.
- (Only if ITSM is set to Cherwell) Refresh the Cherwell business object schemas:
  - Log in to Insights as admin and navigate to Settings > ITSM Configuration.
  - Click the Refresh button.
- (NCE Master) Restart the scheduler topology service:
service meridian-scheduler-topology stop
service meridian-scheduler-topology start
service meridian-scheduler-topology status
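The cacerts check in the post-upgrade steps above compares the file's modification date with today's date. A minimal sketch of that comparison follows. It assumes GNU date, where -r FILE prints the file's last modification time, and the function name modified_today is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: succeed if FILE was last modified today (local time).
# Assumes GNU date, where "-r FILE" reports the file's modification time.
# modified_today is a hypothetical helper name.
modified_today() {
  local file=$1
  [ "$(date -r "$file" +%Y-%m-%d)" = "$(date +%Y-%m-%d)" ]
}

# Example call (replace JAVA_HOME with the actual value):
#   modified_today JAVA_HOME/lib/security/cacerts \
#     && echo "cacerts was changed by the upgrade: restore the backup"
```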
Preventing Services No Longer Used from Starting
The following services are no longer used in Insights:
consul
meridian-notification-service-worker
meridian-notification-service-master
redis
To prevent them from starting, take these steps:
- Stop the services on the respective nodes where they are running:
  - All nodes
  service consul stop
  - NCE Master
  service meridian-notification-service-worker stop
  service meridian-notification-service-master stop
  service redis stop
- Remove the symbolic links for the services from the /etc/service directory.
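Removing the symbolic links can be done in one pass. The sketch below assumes that each link in /etc/service is named after its service, which this guide does not state explicitly, so verify the names with ls -l /etc/service first. The function name remove_unused_service_links is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: remove the symbolic links of the services that are no longer used.
# Assumes each link is named after its service; verify before running.
# remove_unused_service_links is a hypothetical helper name.
remove_unused_service_links() {
  local service_dir=${1:-/etc/service} name link
  for name in consul meridian-notification-service-worker \
              meridian-notification-service-master redis; do
    link="$service_dir/$name"
    # Touch only entries that really are symbolic links.
    if [ -L "$link" ]; then
      rm -- "$link" && echo "removed $link"
    fi
  done
}

# Example call (run after stopping the services):
#   remove_unused_service_links /etc/service
```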
Troubleshooting the Upgrade
NCE Master Upgrade Fails
The NCE Master node deployment might fail on the following step:
Error executing action `run` on resource 'bash[upload mapping and data files from meridian-core]'
- Execute the following command:
curl --tlsv1.2 -k -w '%{http_code}' -X DELETE -H 'Content-Type: application/json' -u admin:FixStream https://<node_ip>:9200/_template/event_metadata_template?pretty
- Rerun the upgrade.
DC Upgrade Fails
The deployment of a DC node might fail on the following step due to contention:
Error executing action `run` on resource 'execute[install meridian-dc common package]'
If that happens, take these steps:
- Stop all services on the DC node that failed installation:
service meridian stop
- Rerun the upgrade.
You might have to rerun it several times until it succeeds.
If it still fails after several retries, contact your Resolve representative.
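Because the DC upgrade may need several reruns, a bounded retry loop can save manual effort. The following is a generic sketch, not part of the Insights tooling: the function name retry, the RETRY_DELAY variable, and the 30-second default pause are assumptions.

```bash
#!/usr/bin/env bash
# Sketch: run a command until it succeeds, at most MAX_ATTEMPTS times.
# retry and RETRY_DELAY are illustrative names; the 30-second default pause
# between attempts is an assumption.
retry() {
  local max_attempts=$1 attempt=1
  shift
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-30}"
  done
}

# Example call (rerun_dc_upgrade is a function you would define yourself to
# stop the services and rerun the deployment script):
#   retry 5 rerun_dc_upgrade || echo "Contact your Resolve representative"
```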
Kibana Does Not Start After Upgrade
If Kibana cannot start after the upgrade, try the following solution on the NCE Master:
- Delete the .kibana index on the NCE Master:
curl --tlsv1.2 -k -w '%{http_code}' -X DELETE -H 'Content-Type: application/json' -u admin:FixStream https://<nce master>:9200/.kibana?pretty
Where:
admin:FixStream are your Elasticsearch username and password
<nce master> is the IP address of the NCE Master
- Restart the Kibana process:
service kibana stop
service kibana start
Elasticsearch Upgrade Fails
In case any upgrade step relating to Elasticsearch fails on an NCE node due to user authentication issues, like not being able to authenticate user admin/FixStream, then take the following remediation steps:
Create user admin using the following command:
/opt/meridian/elasticsearch/bin/elasticsearch-users useradd admin -p <password> -r superuser
Where <password> is the password that you want to assign to the new account. Replace /opt/meridian with the actual path of your Insights deployment.
Rerun the upgrade.
Applications Missing after Upgrade
If, after upgrading to the latest Insights release, you find that some or all of the applications are not visible, take the following steps to restore them:
Log in to Kibana at https://<NCE>/kibana.
Go to DevTools > Console. Use the console to run the queries in the rest of the steps.
Acquire the organization ID:
- Get the IDs of all Insights organizations:
GET organization_meridian_config/_search
- Take note of the orgId attribute of the organization where you are restoring applications. For example, in the following output, the orgId of the RESOLVE organization is 642d1632-623c-4305-a1ab-a4f97cf83249:RESOLVE.
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1.0,
"hits": [
{
"_index": "organization_meridian_config-000001",
"_type": "organization",
"_id": "642d1632-623c-4305-a1ab-a4f97cf83249",
"_score": 1.0,
"_source": {
"lastUpdateTime": "2022-09-19T06:22:15.268Z",
"orgId": "642d1632-623c-4305-a1ab-a4f97cf83249:RESOLVE",
"orgName": "RESOLVE"
}
},
{
"_index": "organization_meridian_config-000001",
"_type": "organization",
"_id": "2793f428-0780-4f8b-97e3-bdda6136f160",
"_score": 1.0,
"_source": {
"lastUpdateTime": "2022-10-13T11:35:59.355Z",
"orgId": "2793f428-0780-4f8b-97e3-bdda6136f160:TEST",
"orgName": "TEST"
}
}
]
}
}
Acquire a list of all applications in Elasticsearch:
GET application_meridian_topology/_search?size=1000
{
"query": {
"bool": {
"must": [
{
"match": {
"orgId": "<organization-id>"
}
}
]
}
}
}
Where:
- _search?size=1000: Alter the search size depending on the number of applications that you have.
- <organization-id>: The organization ID that you acquired.
Search the output for applicationName and take note of the applications that you don't see in your Insights user interface. More specifically, take note of these attributes:
- applicationName
- applicationId
For every missing application, run the following query to restore the application:
POST application_meridian_topology/application/<organization ID>:<application ID>
{
"lastCMDBUpdateTime": null,
"isSystemCreated": false,
"level": 0,
"groupId": "default",
"levelName": "Application",
"applicationId": "<application ID>",
"type": "APPLICATION",
"orgId": "<organization ID>",
"applicationName": "<application name>",
"lastUpdateTime": "<current time>"
}
Where:
- <organization ID>: The ID of the organization where you are restoring the application.
- <application ID>: The ID of the missing application.
- <application name>: The name of the missing application.
- <current time>: The current date and time. For example, 2022-09-13T22:19:19.334Z.
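When restoring many applications, the two values that change per request are the document path and the <current time> value. The sketch below generates both. The function names are hypothetical, and the fixed .000 milliseconds part is an assumption made for simplicity; the format otherwise follows the example timestamp above.

```bash
#!/usr/bin/env bash
# Sketch: generate the per-application parts of the restore query.
# Both function names are illustrative, not part of the Insights tooling.

# Print the current UTC time in the format of the example above.
# The milliseconds are fixed to .000 for simplicity (an assumption).
current_timestamp() {
  date -u +%Y-%m-%dT%H:%M:%S.000Z
}

# Print the document path used in the POST request.
application_doc_path() {
  local org_id=$1 app_id=$2
  printf 'application_meridian_topology/application/%s:%s\n' "$org_id" "$app_id"
}

# Example call (IDs are placeholders):
#   echo "POST $(application_doc_path '<organization ID>' '<application ID>')"
#   echo "lastUpdateTime: $(current_timestamp)"
```

Paste the generated values into the query in the Kibana Console; the remaining fields stay as shown above.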
Elasticsearch Fails to Start
The upgrade process requires you to stop and restart Elasticsearch in several places. If Elasticsearch fails to restart, it might have exhausted the open file descriptor limit set for the current user while trying to allocate shards for existing metric-related indexes.
If Elasticsearch hits that limit and shows an error, the solution is to increase the open file descriptor limit for the user running Insights.
Take these steps on the NCE Master to increase the number of allowed open file descriptors:
- Stop Elasticsearch:
service elasticsearch stop
- Open the meridian.conf limits file:
vi /etc/security/limits.d/meridian.conf
- Update the meridian -nofile line to a higher number and save the changes. Use a new value of 256000 or higher.
- Start Elasticsearch:
service elasticsearch start
- Check if the limits have applied. The result of each command must be the new, higher number that you entered.
ulimit -Hn
ulimit -Sn
- Wait until Elasticsearch is up and running and validate the status and pending tasks.
  - Execute the following command to get the status of Elasticsearch:
  curl --tlsv1.2 -s -k -w '%{http_code}' -X GET -H 'Content-Type: application/json' -u <username>:<password> https://<NCE-Master-IP>:9200/_cluster/health?pretty
  Where <username> and <password> are the credentials for Elasticsearch/Kafka, and <NCE-Master-IP> is the IP of the Master NCE node.
  - In the output, verify that Elasticsearch has finished. In case the following output lines don't show the specified values, give Elasticsearch more time and then check again.
    - Verify that status is green or yellow.
    - Verify that number_of_pending_tasks shows 0.
- Re-run the upgrade.
Optionally, you can bring the ulimits back down to the previous level after successfully completing the upgrade.
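The two health checks described above (status is green or yellow, and number_of_pending_tasks is 0) can be evaluated from the cluster health output with a small helper. This is a sketch that matches the two fields with grep instead of a JSON parser, and the function name health_ready is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: succeed if the cluster health output shows status green or yellow
# and zero pending tasks. Uses grep pattern matching, not a JSON parser.
# health_ready is a hypothetical helper name.
health_ready() {
  local health_json=$1
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"(green|yellow)"' <<<"$health_json" &&
    grep -Eq '"number_of_pending_tasks"[[:space:]]*:[[:space:]]*0([^0-9]|$)' <<<"$health_json"
}

# Example call (fill in the placeholders first):
#   out=$(curl --tlsv1.2 -s -k -u <username>:<password> \
#     "https://<NCE-Master-IP>:9200/_cluster/health?pretty")
#   health_ready "$out" && echo "Elasticsearch is ready: re-run the upgrade"
```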